Smart Paper Notes

Towards mapping the contemporary art world with ArtLM: an art-specific NLP model

Qinkai Chen , Mohamed El-Mennaoui , Antoine Fosset , Amine Rebei , Haoyang Cao , Christy Eóin O'Beirne , Sasha Shevchenko , Mathieu Rosenbaum

Categories: Natural Language Processing | Machine Learning

2022-12-14

With an increasing amount of data in the art world, discovering artists and artworks suitable to collectors' tastes becomes a challenge. It is no longer enough to use visual information, as contextual information about the artist has become just as important in contemporary art. In this work, we present a generic Natural Language Processing framework (called ArtLM) to discover the connections among contemporary artists based on their biographies. In this approach, we first continue to pre-train the existing general English language models with a large amount of unlabelled art-related data. We then fine-tune this new pre-trained model with our biography pair dataset manually annotated by a team of professionals in the art industry. With extensive experiments, we demonstrate that our ArtLM achieves 85.6% accuracy and 84.0% F1 score and outperforms other baseline models. We also provide a visualisation and a qualitative analysis of the artist network built from ArtLM's outputs.

Translated by Google Translate

SHREC'22 Track: Sketch-Based 3D Shape Retrieval in the Wild

Jie Qin , Shuaihang Yuan , Jiaxin Chen , Boulbaba Ben Amor , Yi Fang , Nhat Hoang-Xuan , Chi-Bien Chu , Khoi-Nguyen Nguyen-Ngoc , Thien-Tri Cao , Nhat-Khang Ngo

Categories: Computer Vision

2022-07-11

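Sketch-based 3D shape retrieval (SBSR) is an important yet challenging task that has attracted increasing attention in recent years. Existing approaches address the problem in a restricted setting, without appropriately simulating real application scenarios. To mimic a realistic setting, in this track we adopt large-scale sketches drawn by amateurs with different levels of drawing skill, as well as a variety of 3D shapes that include not only CAD models but also models scanned from real objects. We define two SBSR tasks and construct two benchmarks consisting of more than 46,000 CAD models, 1,700 realistic models, and 145,000 sketches. Four teams participated in this track and submitted 15 runs for the two tasks, evaluated by 7 commonly used metrics. We hope that the benchmarks, the comparative results, and the open-source evaluation code will foster future research in the 3D object retrieval community.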

Identifiability in inverse reinforcement learning

Haoyang Cao , Samuel N. Cohen , Lukasz Szpruch

Categories: Machine Learning

2021-06-07

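Inverse reinforcement learning attempts to reconstruct the reward function in a Markov decision problem from observations of an agent's actions. As already observed by Russell [1998], the problem is ill-posed: the reward function is not identifiable, even given perfect information about optimal behavior. We provide a resolution to this non-identifiability for problems with entropy regularization. For a given environment, we fully characterize the reward functions that lead to a given policy, and we show that, given demonstrations of actions for the same reward under two distinct discount factors, or under sufficiently different environments, the unobserved reward can be recovered. We also give general necessary and sufficient conditions for the reconstruction of time-homogeneous rewards over finite horizons, and for action-independent rewards, generalizing recent results of Kim et al. [2021] and Fu et al. [2018].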
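
The non-identifiability described above can be illustrated with a small numerical sketch. It is not the authors' construction: it assumes a tabular MDP, an entropy weight `lam`, and the standard potential-based shaping of a reward by an arbitrary state function (the names `soft_optimal_policy`, `shape_reward`, and `potential` are hypothetical). Under these assumptions, two different rewards yield the same entropy-regularized optimal policy, so observing the policy alone cannot distinguish them.

```python
import numpy as np

def soft_optimal_policy(reward, transition, gamma=0.9, lam=1.0, iters=2000):
    """Entropy-regularized (soft) value iteration on a tabular MDP.

    reward:     array of shape (S, A)
    transition: array of shape (S, A, S), each transition[s, a] sums to 1
    Returns the soft-optimal policy as an (S, A) array of probabilities.
    """
    v = np.zeros(reward.shape[0])
    for _ in range(iters):
        q = reward + gamma * (transition @ v)          # soft Q-values, (S, A)
        m = q.max(axis=1)                              # for a stable log-sum-exp
        v = m + lam * np.log(np.exp((q - m[:, None]) / lam).sum(axis=1))
    q = reward + gamma * (transition @ v)
    z = (q - q.max(axis=1, keepdims=True)) / lam
    p = np.exp(z)
    return p / p.sum(axis=1, keepdims=True)            # softmax over actions

def shape_reward(reward, transition, potential, gamma=0.9):
    """Add gamma * E[potential(s')] - potential(s) to the reward.

    This shifts every soft Q-value in state s by -potential(s), which does
    not depend on the action, so the softmax policy is unchanged.
    """
    return reward + gamma * (transition @ potential) - potential[:, None]
```

Calling `soft_optimal_policy` on a reward and on its shaped version (with the same `transition`, `gamma`, and `lam`) should return matching policies up to numerical precision, even though the two reward tables differ.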

Boosting Neural Networks to Decompile Optimized Binaries

Ying Cao , Ruigang Liang , Kai Chen , Peiwei Hu

Categories: Machine Learning

2023-01-03

Decompilation aims to transform a low-level program language (LPL) (e.g., a binary file) into its functionally equivalent high-level program language (HPL) (e.g., C/C++). It is a core technology in software security, especially in vulnerability discovery and malware analysis. In recent years, with the successful application of neural machine translation (NMT) models in natural language processing (NLP), researchers have tried to build neural decompilers by borrowing the idea of NMT. They formulate the decompilation process as a translation problem between LPL and HPL, aiming to reduce the human cost required to develop decompilation tools and improve their generalizability. However, state-of-the-art learning-based decompilers do not cope well with compiler-optimized binaries. Since real-world binaries are mostly compiler-optimized, decompilers that do not consider optimized binaries have limited practical significance. In this paper, we propose a novel learning-based approach named NeurDP that targets compiler-optimized binaries. NeurDP uses a graph neural network (GNN) model to convert LPL to an intermediate representation (IR), which bridges the gap between source code and optimized binary. We also design an Optimized Translation Unit (OTU) to split functions into smaller code fragments for better translation performance. Evaluation results on datasets containing various types of statements show that NeurDP can decompile optimized binaries with 45.21% higher accuracy than state-of-the-art neural decompilation frameworks.

P3DC-Shot: Prior-Driven Discrete Data Calibration for Nearest-Neighbor Few-Shot Classification

Shuangmei Wang , Rui Ma , Tieru Wu , Yang Cao

Categories: Computer Vision

2023-01-02

Nearest-Neighbor (NN) classification has proven to be a simple and effective approach for few-shot learning. The query data can be classified efficiently by finding the nearest support class based on features extracted by pretrained deep models. However, NN-based methods are sensitive to the data distribution and may produce false predictions if the samples in the support set happen to lie around the distribution boundary of different classes. To solve this issue, we present P3DC-Shot, an improved nearest-neighbor based few-shot classification method empowered by prior-driven data calibration. Inspired by the distribution calibration technique, which utilizes the distribution or statistics of the base classes to calibrate the data for few-shot tasks, we propose a novel discrete data calibration operation that is more suitable for NN-based few-shot classification. Specifically, we treat the prototypes representing each base class as priors and calibrate each support sample based on its similarity to different base prototypes. Then, we perform NN classification using these discretely calibrated support data. Results from extensive experiments on various datasets show that our efficient non-learning based method can outperform, or is at least comparable to, SOTA methods that need additional learning steps.
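
The calibrate-then-classify idea above can be sketched roughly as follows. This is a minimal illustration, not the P3DC-Shot algorithm itself: the cosine similarity, the top-`k` selection, the softmax weighting with `temperature`, and the mixing weight `alpha` are all assumptions made for the example, and the function names are hypothetical.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit Euclidean length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def calibrate_support(support, base_protos, k=2, alpha=0.5, temperature=0.1):
    """Pull each support feature toward its k most similar base prototypes.

    support:     (n_support, d) features of the few-shot support samples
    base_protos: (n_base, d) one prototype per base class, used as priors
    """
    s = l2_normalize(support)
    b = l2_normalize(base_protos)
    sims = s @ b.T                                     # cosine similarities
    topk = np.argsort(-sims, axis=1)[:, :k]            # indices of nearest priors
    top_sims = np.take_along_axis(sims, topk, axis=1)
    w = np.exp((top_sims - top_sims.max(axis=1, keepdims=True)) / temperature)
    w /= w.sum(axis=1, keepdims=True)                  # softmax weights over top-k
    prior = (w[:, :, None] * b[topk]).sum(axis=1)      # weighted prototype mix
    return l2_normalize((1.0 - alpha) * s + alpha * prior)

def nn_classify(queries, support, support_labels):
    """Assign each query the label of its most similar support feature."""
    sims = l2_normalize(queries) @ l2_normalize(support).T
    return support_labels[np.argmax(sims, axis=1)]
```

In use, one would pass the output of `calibrate_support` as the `support` argument of `nn_classify`, with `support_labels` given as a NumPy array. No parameters are trained at any step, which matches the non-learning character described in the abstract.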

Edge Enhanced Image Style Transfer via Transformers

Chiyu Zhang , Jun Yang , Zaiyan Dai , Peng Cao

Categories: Computer Vision

2023-01-02

In recent years, arbitrary image style transfer has attracted more and more attention. Given a pair of content and style images, the goal is a stylized image that retains the content of the former while capturing the style patterns of the latter. However, it is difficult to maintain a good trade-off between content details and style features. When an image is stylized with sufficient style patterns, the content details may be damaged, and sometimes the objects in the image cannot be distinguished clearly. For this reason, we present a new transformer-based method named STT for image style transfer, together with an edge loss that noticeably enhances content details and avoids the blurred results caused by excessive rendering of style features. Qualitative and quantitative experiments demonstrate that STT achieves performance comparable to state-of-the-art image style transfer methods while alleviating the content leak problem.

A RL-based Policy Optimization Method Guided by Adaptive Stability Certification

Shengjie Wang , Fengbo Lan , Xiang Zheng , Yuxue Cao , Oluwatosin Oseni , Haotian Xu , Yang Gao , Tao Zhang

Categories: Robotics | Machine Learning

2023-01-02

In contrast to control-theoretic methods, the lack of a stability guarantee remains a significant problem for model-free reinforcement learning (RL) methods. Jointly learning a policy and a Lyapunov function has recently become a promising approach to providing the whole system with a stability guarantee. However, the classical Lyapunov constraints that researchers have introduced cannot stabilize the system during sampling-based optimization. Therefore, we propose the Adaptive Stability Certification (ASC), which enables the system to reach sampling-based stability. Because the ASC condition can search for the optimal policy heuristically, we design the Adaptive Lyapunov-based Actor-Critic (ALAC) algorithm based on the ASC condition. Meanwhile, our algorithm avoids the optimization problem found in current approaches, in which a variety of constraints are coupled into the objective. When evaluated on ten robotic tasks, our method achieves lower accumulated cost and fewer stability constraint violations than previous studies.

eVAE: Evolutionary Variational Autoencoder

Zhangkai Wu , Longbing Cao , Lei Qi

Categories: Neural and Evolutionary Computing | Machine Learning

2023-01-01

The surrogate loss of variational autoencoders (VAEs) poses various challenges to their training, inducing an imbalance between task fitting and representation inference. To avert this, existing strategies for VAEs focus on adjusting the tradeoff by introducing hyperparameters, deriving a tighter bound under some mild assumptions, or decomposing the loss components for certain neural settings. VAEs still suffer from uncertain tradeoff learning. We propose a novel evolutionary variational autoencoder (eVAE) building on variational information bottleneck (VIB) theory and integrative evolutionary neural learning. eVAE integrates a variational genetic algorithm into the VAE with variational evolutionary operators including variational mutation, crossover, and evolution. Its inner-outer-joint training mechanism synergistically and dynamically generates and updates the uncertain tradeoff learning in the evidence lower bound (ELBO) without additional constraints. Apart from learning a lossy compression and representation of data under the VIB assumption, eVAE presents an evolutionary paradigm to tune critical factors of VAEs and deep neural networks, and addresses the premature convergence and random search problems by integrating evolutionary optimization into deep learning. Experiments show that eVAE addresses the KL-vanishing problem for text generation with low reconstruction loss, generates all disentangled factors with sharp images, and improves image generation quality. eVAE achieves better reconstruction loss, disentanglement, and generation-inference balance than its competitors.
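
To make the tradeoff in the ELBO concrete, here is a minimal sketch of the hyperparameter-weighted objective that the abstract lists among existing strategies. It does not implement eVAE's evolutionary operators. It assumes a diagonal Gaussian posterior, a standard normal prior, and a unit-variance Gaussian likelihood (so the reconstruction term is a squared error up to a constant); `beta` and the function names are hypothetical.

```python
import numpy as np

def gaussian_kl(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dims."""
    return 0.5 * np.sum(mu**2 + np.exp(logvar) - 1.0 - logvar, axis=-1)

def weighted_negative_elbo(x, x_recon, mu, logvar, beta=1.0):
    """Reconstruction error plus beta times the KL regularizer.

    beta = 1 gives the usual negative ELBO (up to an additive constant);
    other values shift the balance between fitting the data and keeping
    the posterior close to the prior.
    """
    recon = 0.5 * np.sum((x - x_recon) ** 2, axis=-1)
    return recon + beta * gaussian_kl(mu, logvar)
```

In this form the single scalar `beta` is the tradeoff factor that must be chosen by hand. According to the abstract, eVAE instead updates this kind of tradeoff dynamically during training.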

Translating Text Synopses to Video Storyboards

Xu Gu , Yuchong Sun , Feiyue Ni , Shizhe Chen , Ruihua Song , Boyuan Li , Xiang Cao

Categories: Computer Vision

2022-12-31

A storyboard is a roadmap for video creation that consists of shot-by-shot images visualizing the key plots in a text synopsis. Creating video storyboards, however, remains challenging: it not only requires associating high-level text with images but also demands long-term reasoning to make transitions smooth across shots. In this paper, we propose a new task called Text synopsis to Video Storyboard (TeViS), which aims to retrieve an ordered sequence of images to visualize the text synopsis. We construct a MovieNet-TeViS benchmark based on the public MovieNet dataset. It contains 10K text synopses, each paired with keyframes that are manually selected from the corresponding movies by considering both relevance and cinematic coherence. We also present an encoder-decoder baseline for the task. The model uses a pretrained vision-and-language model to improve high-level text-image matching. To improve coherence across long sequences of shots, we further propose to pre-train the decoder on large-scale movie frames without text. Experimental results demonstrate that our proposed model significantly outperforms other models in creating text-relevant and coherent storyboards. Nevertheless, there is still a large gap compared to human performance, suggesting room for promising future work.

Autonomous Driving Simulator based on Neurorobotics Platform

Wei Cao , Liguo Zhou , Yuhong Huang , Alois Knoll

Categories: Robotics | Artificial Intelligence

2022-12-31

There are many artificial intelligence algorithms for autonomous driving, but installing them directly on vehicles is unrealistic and expensive. At the same time, many of these algorithms need an environment in which to train and optimize. Simulation is a valuable solution that provides both training and testing functions, and it can be said that simulation is a critical link in the autonomous driving world. Many simulation applications and systems are available from companies and academia, such as SVL and Carla. These simulators claim to offer the closest match to the real world, but their environment objects, such as pedestrians and other vehicles around the agent vehicle, are programmed in a fixed way: they can only move along preset trajectories, or their movements are determined by random numbers. What happens when all environment objects are also controlled by artificial intelligence, or when they behave like real people or react naturally like other drivers? This problem is a blind spot for most simulation applications, which cannot easily solve it. The Neurorobotics Platform from the TUM team of Prof. Alois Knoll offers the concepts of "Engines" and "Transceiver Functions" to address the multi-agent problem. This report begins with a brief study of the Neurorobotics Platform and analyzes the potential and feasibility of developing a new simulator that achieves truly realistic simulation. Then, based on the NRP-Core platform, this initial development aims to construct a first demo experiment. The report starts with the basic knowledge of NRP-Core and its installation, then explains the components necessary for a simulation experiment, and finally details the construction of the autonomous driving system, which integrates object detection and autonomous control.
